(57) Abstract: A distributed network search mechanism may be provided for consumers coupled to a network to search information
providers coupled to the network. Consumers may make search requests according to a query routing protocol. A network hub 
may be configured to receive search requests from consumers. The hub may also receive registration requests from information 
providers according to the query routing protocol. Information providers register with the hub to indicate the search queries
they are interested in receiving. When a query request is received, the hub resolves the query request with a provider registration
index. The hub matches search query information from the query request with provider registrations to determine which providers 
have registered to receive search queries like the current search query. The hub then routes the search query to matching providers 
according to the query routing protocol. 
TITLE: DISTRIBUTED INFORMATION DISCOVERY



BACKGROUND OF THE INVENTION 

1. Field of the Invention 

This invention relates to computer networks, and more particularly to a system and method for providing a 
distributed information discovery platform that enables discovery of information from distributed information 
providers. 

2. Description of the Related Art 

It has been estimated that the amount of content contained in distributed information sources on the public 
web is over 550 billion documents. In comparison, leading Internet search engines may be capable of searching 
only about 600 million pages out of an estimated 1.2 billion "static pages." Due to the dynamic nature of Internet 
content, much of the content is unsearchable by conventional search means. In addition, the amount of content
unsearchable by conventional means is growing rapidly with the increasing use of application servers and web 
enabled business systems. 

Conventional crawler-based search engines such as Google are suited to indexing static, slowly
changing web pages such as home pages or corporate information pages. Crawlers currently may take three months
or more to crawl and index the web (Google numbers). Targeted or restricted crawling of headline or other
metadata is possible (such as that done by moreover.com).

Some web resources may not have a "page of contents" or similar index. As an example, Amazon.com 
contains millions of product descriptions in its databases but does not have a set of pages listing all these 
descriptions. As a result, in order to crawl such a resource, it may be necessary to query the database repeatedly with
every conceivable query term until all products were extracted. Since many web pages are generated dynamically
given information about the consumer or context of the query (time, purchasing behavior, location, etc.), a crawler 
approach is likely to lead to distortion of such data. In some situations, content may be inaccessible due to access 
privileges (e.g. a subscription site), or for security reasons (e.g. a secure content site). 

Conventional search mechanisms may be less efficient than desirable in regard to some types of providers.
For example, consider the role of a crawler-type search mechanism accessing dynamic content from a news site. A
current news provider may provide content created by editors and stored in a database as XML or other presentation 
neutral form. The news provider's application server may render the content as a web page with associated links 
using current templates. The end user may see a well-presented page with the story they were looking for. However, 
when a crawler-type search engine hits the page, all it sees is a mess of HTML. In order to extract the content of the
story, it must be programmed to use information about the structure of the HTML page to "scrape" the content and
headline from the page. It may then store this content or a processed version for indexing purposes in its own 
database, and retrieve the link and story when a query matching the story is submitted. This process from database 
to HTML and back to database is inherently inefficient and prone to errors. In addition it gives the content provider 
no control over the format of the article or the decision about which article to show in response to a query. 

SUMMARY OF THE INVENTION

A distributed network search mechanism may be provided for consumers coupled to a network to search 
information providers coupled to the network. Consumers may make search requests according to a query routing 
protocol. A network hub may be configured to receive search requests from consumers. The hub may also receive 
registration requests from information providers according to the query routing protocol. Information providers 
register with the hub to indicate the search queries they are interested in receiving. To request registration, a
provider may send a registration file to the hub according to the query routing protocol. The registration file
includes an indication of query requests that the provider desires to receive and a query server address for the
provider. The hub maintains an index of provider registrations. When a query request is received, the hub resolves
the query request with the provider registration index. The hub matches search query information from the query
request with provider registrations to determine which providers have registered to receive search queries like the
current search query. The hub then routes the search query to matching providers according to the query routing
protocol. 

Each provider receiving a search query according to the query routing protocol may respond with search 
results according to the query routing protocol. Each provider is able to provide or customize its search results as it 
sees fit. The hub collates responses from providers and sends the collated search results in a query response to the consumer
according to the query routing protocol. The query routing protocol specifies a mark-up language format for 
communicating query requests, query responses and registration requests. 

In an embodiment, a network hub may be configured to implement a search method according to a query 
routing protocol. The method may include receiving a query request from a consumer. The query request may
include a search query. The method may include resolving the search query with an index of provider registrations
to select one or more provider registrations. The search query may then be routed to at least one provider specified
by the one or more selected provider registrations. A query response may be received from said at least one
provider. The query response includes search results. Search results are routed to the consumer.

The query request and the query response are formatted according to a query routing protocol. The query 
routing protocol specifies a mark-up language format for communicating query requests and query responses. A 
search query may include an indication of a query-space, which defines a structure for indicating and matching
search criteria, together with search criteria structured according to the indicated query-space.

Each provider registration may include an indication of a query-space. The query-space defines a structure 
for indicating and matching search criteria. The provider registration may also include a predicate statement
structured according to the indicated query-space. The predicate statement defines matching search criteria. The
provider registration may also include a query server address to which matching search queries are to be directed.
Resolving a search query may include applying the search criteria from the search query to the provider
registrations indicating the same query-space as the search query, and selecting the provider registrations that have
both the same query-space as said search query and a predicate statement matching the search criteria from the 
search query. The search query may be routed to the query server addresses specified by one or more of the 
provider registrations selected by said resolving. 
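
By way of illustration only, the resolving step described above may be sketched as follows. The sketch assumes that a predicate statement is a set of required field/value pairs and that a match requires equality on every such pair; the class names, field names, query-space identifiers and addresses are hypothetical and are not defined by the query routing protocol.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Registration:
    """One provider registration (field names are illustrative assumptions)."""
    query_space: str      # identifier of the query-space the predicate is written in
    predicate: dict       # assumed predicate form: required field -> value pairs
    server_address: str   # query server address for matching search queries


@dataclass(frozen=True)
class SearchQuery:
    query_space: str
    criteria: dict        # search criteria structured per the query-space


def resolve(query, index):
    """Select registrations with the same query-space and a matching predicate."""
    selected = []
    for reg in index:
        if reg.query_space != query.query_space:
            continue
        # Assumed matching rule: every predicate field must equal the query's value.
        if all(query.criteria.get(k) == v for k, v in reg.predicate.items()):
            selected.append(reg)
    return selected


def route_targets(query, index):
    """Return the query server addresses to which the query would be routed."""
    return [reg.server_address for reg in resolve(query, index)]


index = [
    Registration("urn:example:books", {"genre": "poetry"}, "http://poetry.example/query"),
    Registration("urn:example:books", {"genre": "history"}, "http://history.example/query"),
    Registration("urn:example:music", {"genre": "poetry"}, "http://music.example/query"),
]
query = SearchQuery("urn:example:books", {"genre": "poetry", "title": "roses"})
print(route_targets(query, index))
```

In this sketch the third registration is not selected even though its predicate matches, because it indicates a different query-space from the search query.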

Registering providers may involve receiving registration requests from a plurality of providers. Each 
registration request includes a registration file. The registration file includes an address and a definition of search 
queries to be sent to the address. The registration files may be stored in the index of provider registrations. The 
registration requests, the query request and the query response are all formatted according to a query routing
protocol, wherein the query routing protocol specifies a mark-up language format for communicating query requests, 
query responses and registration requests. 

BRIEF DESCRIPTION OF THE DRAWINGS 

Figure 1 illustrates a network utilizing the distributed information discovery platform according to one
embodiment; 

Figure 2 illustrates an architecture for the distributed information discovery platform according to one 
embodiment; 

Figure 3 illustrates message flow in a distributed information discovery network according to one 
embodiment; 

Figure 4 illustrates a provider with a query routing protocol interface according to one embodiment; 

Figure 5 illustrates a provider with a query routing protocol interface and a results presentation mechanism 
according to one embodiment; 

Figure 6 illustrates an exemplary distributed information discovery network including a plurality of hubs 
according to one embodiment; 

Figure 7 illustrates provider registration in a distributed information discovery network according to one 
embodiment; 

Figure 8 is a flowchart illustrating message flow in a distributed information discovery network according 
to one embodiment; 

Figure 9 illustrates an example of several peers in a peer-to-peer network according to one embodiment; 
Figure 10 illustrates a message with envelope, message body, and optional trailer according to one 
embodiment; 

Figure 11 illustrates an exemplary content identifier according to one embodiment;
Figure 12 is a block diagram illustrating two peers using a layered sharing policy and protocols to share 
content according to one embodiment; 

Figure 13 illustrates one embodiment of a policy advertisement; 
Figure 14 illustrates one embodiment of a peer advertisement; 
Figure 15 illustrates one embodiment of a peer group advertisement; 
Figure 16 illustrates one embodiment of a pipe advertisement;
Figure 17 illustrates one embodiment of a service advertisement; 
Figure 18 illustrates one embodiment of a content advertisement; and 

Figure 19 is a block diagram illustrating one embodiment of a network protocol stack in a peer-to-peer 
platform. 

While the invention is described herein by way of example for several embodiments and illustrative 
drawings, those skilled in the art will recognize that the invention is not limited to the embodiments or drawings 
described. It should be understood that the drawings and detailed description thereto are not intended to limit the
invention to the particular form disclosed, but on the contrary, the intention is to cover all modifications, equivalents 
and alternatives falling within the spirit and scope of the present invention as defined by the appended claims. The 
headings used herein are for organizational purposes only and are not meant to be used to limit the scope of the
description or the claims. As used throughout this application, the word "may" is used in a permissive sense (i.e.,
meaning having the potential to), rather than the mandatory sense (i.e., meaning must). Similarly, the words
"include", "including", and "includes" mean including, but not limited to.

DETAILED DESCRIPTION OF EMBODIMENTS OF THE INVENTION

A system and method for providing a distributed information discovery platform that enables discovery of 
up-to-date information from distributed information providers is described. In an embodiment, in contrast to 
conventional search engines and exchanges, the distributed information discovery platform does not centralize 
information; rather it may search for information in a distributed manner. This distributed searching may enable
content providers to deliver up-to-the-second responses to search queries from a user or client.

In the distributed information discovery platform, queries are distributed to "peers" in a network that are
most likely to be capable of answering the query. The distributed information discovery platform provides a
common distributed query mechanism for devices ranging from web servers to small computers. The distributed
information discovery platform may be applied to a wide variety of domains: from publicly accessible web search to
private networks of trading partners to interaction between distributed services and applications, for example.

The distributed information discovery platform may be applied in a wide variety of domains, including, but 
not limited to: publicly accessible web search, private networks of trading partners, and interaction between
distributed services and applications. The distributed information discovery platform may also be applied to
Peer-to-Peer (P2P) networking, exemplified in programs such as Napster and Gnutella. For example, in one embodiment
the distributed information discovery platform may include a web front end to a distributed set of servers, each
running a P2P node and responding to queries. Each node may be registered (or hard coded in some embodiments)
to respond to certain queries or kinds of queries. For example, one of the nodes may include a calculator service
which would respond to a numeric expression query with the solution. Other nodes may be configured for file
sharing and may be registered to respond to certain queries. A search query on a corporate name may return an
up-to-the-minute stock quote and current news stories on the corporation. Instead of presenting only text-based search
results, the distributed information discovery platform may return other visual or audio search results. For example, 
a search query for "roses" may return photo images of roses. 

In some embodiments, the distributed information discovery platform may leverage web technologies (e.g. 
HTTP/XML). These technologies may provide a more familiar environment to developers and webmasters than less
common or proprietary protocols. In addition, leveraging such web technologies may simplify a user's task in
interfacing to the query routing protocol of the distributed information discovery platform. 

The distributed information discovery platform may provide an abstract query routing service for networks 
with arbitrary messaging and transport mechanisms. In one embodiment, the distributed information discovery 
platform may bind with the Web (e.g. XML over HTTP). In one embodiment, the distributed information discovery
platform may bind with a peer-to-peer networking environment. In a peer-to-peer networking environment, entities
of the distributed information discovery platform (e.g. consumers, providers, hubs, registration services, etc.) may
be implemented on peers in the network. Each peer may run instances of the provider, consumer and registration
services on top of its peer-to-peer networking core. Each peer may interact with an instance of a hub service, itself
running on top of the peer-to-peer networking core. One peer-to-peer networking environment with which the
distributed information discovery platform may bind is implemented with a novel open network computing platform
for peer-to-peer networks, which may be referred to as a peer-to-peer platform. This peer-to-peer networking
environment is described later in this document.

There may be some differences in some of the internal mechanics of embodiments that bind to different
networks. In general, the query routing protocol and the resolution mechanism may be the same or similar in the
different embodiments. The routing mechanism and the client interfaces in the different embodiments, however, may
be implemented at least partially differently to support the different network types.

In addition to functioning as a "meta-search" engine, the distributed information discovery platform may
include support for an open protocol for distributed information routing. This protocol for distributed information
routing may be referred to as a query routing protocol. The query routing protocol may define mechanisms for
sending and responding to queries in the network, in addition to mechanisms for defining metadata for nodes in the
network. In one embodiment, the common query routing protocol may allow participants to exchange information
seamlessly without having to understand the structure of their presentation layers. In one embodiment, this query
routing protocol allows information providers to publish a description of queries that they are willing to answer.
Information consumers may submit queries to the network, which routes each query to all interested providers. The
query routing protocol allows participants in the network to exchange information in a seamless manner without
having to understand the structure of the presentation layers.

The query routing protocol may be based on existing open standards, including markup languages such as
XML (Extensible Markup Language) and XML Schema. In addition, the query routing protocol may be
encapsulated within existing protocols, such as HTTP (HyperText Transfer Protocol).

In some embodiments, the query routing protocol of the distributed information discovery platform may
provide an interface designed for simplicity. For example, a minimally-conforming client implementation may be
built in one embodiment using existing libraries for manipulating XML and sending HTTP messages. A
minimally-conforming server implementation may be built in one embodiment with the above tools plus a generic HTTP
server.
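
As a non-limiting sketch of how small such a client might be, the following uses only the Python standard library to build an XML query message and wrap it in an HTTP POST request. The element name "request", the element name "query", the "space" attribute, the query-space identifier and the hub URL are all assumptions made for illustration; this document does not specify the actual message schema.

```python
import urllib.request
import xml.etree.ElementTree as ET


def build_query_message(query_space, text):
    """Serialize a query as XML; element and attribute names are hypothetical."""
    request = ET.Element("request")
    query = ET.SubElement(request, "query", {"space": query_space})
    query.text = text
    return ET.tostring(request, encoding="utf-8")


def build_http_request(hub_url, body):
    """Wrap the XML body in an HTTP POST; nothing is sent until urlopen is called."""
    return urllib.request.Request(
        hub_url,
        data=body,
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


body = build_query_message("urn:example:keywords", "roses")
http_request = build_http_request("http://hub.example/query", body)
# A real client would now pass http_request to urllib.request.urlopen and
# parse the XML reply with xml.etree.ElementTree.
```

The request object is only constructed here, not sent, so the sketch can be examined without a running hub.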

The query routing protocol of the distributed information discovery platform may also provide structure.
For example, in one embodiment, queries on a distributed information discovery network may be made using XML
messages conforming to a particular schema or queryspace. In such an embodiment, information providers may
register "templates" describing the structure of queries to which they are willing to respond.

The query routing protocol of the distributed information discovery platform may also provide extensibility.
In some embodiments, arbitrary schemas or queryspaces may be used on a distributed information discovery
network. In such embodiments, there may be no need for centralized schema or queryspace management. Thus, ad
hoc collaboration may be simplified.

The query routing protocol of the distributed information discovery platform may also provide scalability.
For example, in one embodiment, a distributed information discovery network may support millions of publishers
and consumers performing billions of transactions per day. In some embodiments, sophisticated implementations
may take advantage of advanced connection-management features provided by lower-level protocols (e.g.
HTTP/1.1).

Some embodiments of the distributed information discovery platform may be used for two complementary
search types: wide and deep. The concept of the expanded web covers both wide search of distributed devices (e.g.
PCs, handheld devices, PDAs, cell phones, etc.) and deep search of rich content sources such as web servers.

In one embodiment, the distributed information discovery platform may be used to provide "wide search"
on the web. Within the context of wide search, the distributed information discovery platform may provide an
efficient mechanism for distributing queries across a wide network of peers. The alternative to this is each peer
sending its queries to all the peers it knows about, which becomes expensive and inefficient in terms of bandwidth
usage and search speed. The opposite extreme to this fully distributed search would be to have a server peer which
handled all queries for all peers in the network. This would be inefficient since the server peer would become
bottlenecked by all the queries, and would be a single point of failure in the network. The compromise employed by
the distributed information discovery platform is a series of "hub" peers, each of which handles the queries for a
group of peers. Each hub peer may specialize in an attribute such as geography, peer content similarity or
application. Hub peers may forward queries to other hub peers either if they cannot satisfy the query or if it is
desirable to expand the search to the widest number of peers possible.
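
The hub-to-hub forwarding rule just described may be sketched, purely for illustration, as below. The class name, the keyword-set representation of what each peer can answer, and the use of a visited set to stop forwarding loops are assumptions of this sketch rather than features stated in this document.

```python
class HubPeer:
    """A hub peer that serves a group of peers and knows some neighbour hubs."""

    def __init__(self, name, local_content):
        self.name = name
        self.local_content = local_content  # peer name -> set of keywords it can answer
        self.neighbours = []                # other HubPeer objects

    def search(self, keyword, expand=False, visited=None):
        """Return peers able to answer `keyword`, forwarding to other hubs if needed."""
        visited = visited if visited is not None else set()
        visited.add(self.name)
        hits = sorted(p for p, words in self.local_content.items() if keyword in words)
        # Forward when this hub cannot satisfy the query, or when a wider search is wanted.
        if not hits or expand:
            for hub in self.neighbours:
                if hub.name not in visited:  # assumed safeguard against forwarding loops
                    hits.extend(hub.search(keyword, expand, visited))
        return hits


europe = HubPeer("europe", {"peerA": {"roses"}})
asia = HubPeer("asia", {"peerB": {"roses", "tulips"}})
europe.neighbours.append(asia)
asia.neighbours.append(europe)

print(europe.search("roses"))               # answered locally
print(europe.search("roses", expand=True))  # widened to neighbouring hubs
print(europe.search("tulips"))              # forwarded because no local peer matches
```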

In one embodiment, the distributed information discovery platform may be used to provide "deep search" 
on the web. "Deep search" may find information embedded in large databases such as product databases (e.g. 
Amazon.com) or news article databases (e.g. CNN). In one embodiment, rather than crawling such databases,
indexing and storing the data, the distributed information discovery platform may be used to determine which
queries should be sent to such databases and direct these queries to the appropriate database provider(s). The
database provider's own search capabilities may be employed to respond to the query through the distributed 
information discovery platform. Thus, the resulting search results may be more up-to-date and have wider coverage 
than a set of conventional crawler search engine results. 

The ability to search recently updated information may make the distributed information discovery platform
well suited for "deep search." Crawler-based search engines may require three or more months to crawl and index
the web. Given that the "deep web" may be at least 10 times larger than the crawlable web, the time to crawl and
index such resources may be prohibitive given the frequency at which many "deep web" resources update (e.g.
every 30 days). Although targeted or restricted crawling of headline or other metadata is possible, such targeted or
restricted crawling may still be an inefficient method for many deep searches, due to the "access" issue.

Unlike static web pages, deep web resources do not typically have a "page of contents" or similar index. 
As an example, Amazon.com contains millions of product descriptions in its databases but does not have a set of 
pages listing all these descriptions. As a result, in order to crawl such a resource, it may be necessary to query the 
database repeatedly with every conceivable query term until all products were extracted. In addition, since many
such pages are generated dynamically given information about the consumer or context of the query (e.g. time,
purchasing behavior, location, etc.), a crawler approach may lead to distortion of such data. For such reasons, some
database providers may be resistant to allowing other search engines to query their databases remotely. Finally, in
some situations, content may be inaccessible due to access privileges (e.g. a subscription site), or for security
reasons (e.g. a secure content site). The distributed information discovery platform may leverage remote access or
public search capabilities provided by information providers. Furthermore, under the distributed information
discovery platform, a provider that wishes to restrict remote access may still allow searching and control how its 
content is searched by registering with a distributed information discovery network. 

Conventional search mechanisms may be less efficient than desirable in regard to some types of providers. 
In addition, conventional search mechanisms typically give the content provider no control over the format of the
data or the decision about which article to show in response to a query. In contrast, the distributed information
discovery platform specifies a common query routing protocol which may give both parties more flexibility and
control of the exchange of data, which may improve search efficiency in some embodiments. 

Figure 1 illustrates a network that utilizes the distributed information discovery platform according to one 
embodiment. The distributed information discovery platform may be applied to create a distributed information 
discovery network having three main types of participants. Information providers, or simply providers 120, may
each register a description of themselves on the distributed information discovery network, and then wait for requests
matching information in the description. A provider 120 may be defined as anything that responds to requests
(queries) in the network. A provider 120 may be, for example, a peer in a peer-to-peer network or a Web server
such as cnn.com. In one embodiment, providers 120 may register by sending registration information to the hub.
The registration information may include metadata describing the types of queries which the provider 120 may be
able to respond to. In one embodiment, the registration information may be maintained in a registration repository
that may include registration information for a plurality of providers 120.

Consumers 140 may query the network and wait for responses from providers 120. A consumer 140 may
be defined as anything that makes requests in the network. A consumer 140 may be, for example, a peer in a
peer-to-peer network or a web site with an HTTP client interface to the network. In one embodiment, a "search button +
results" interface item or items may be added to web pages of web sites that may invoke the search capabilities
provided by the distributed information discovery platform. A network routing system, referred to as a hub 100, may
handle query and response routing in the network. A hub 100 may act as an access point that may provide virtual
access to the entire distributed information discovery network. Hubs 100 may facilitate efficient query routing over
the network by handling message routing between consumers 140 and providers 120. Providers 120 may register a
description with the hub 100 and wait for matching requests. Consumers 140 may query the network through the
hub 100 and await responses.

A consumer 140 may initiate a query in the network. In one embodiment, the query may be sent to a hub
100 nearest to the consumer 140. The hub 100 then determines one or more providers 120 of which the hub 100 is
aware (e.g. that have registered with the hub 100) and that may be qualified to process the query. In one
embodiment, a hub 100 may include a resolver 102 which may handle the determination of qualified providers 120.
Metadata the hub 100 has on the providers 120, including the provider descriptions registered with the hub 100, may
be used to determine the qualified provider(s) 120. The hub 100 then may send the query to the provider(s) 120 it
has determined to be qualified. Each provider 120 that receives the query may process the received query and send
one or more responses to the hub 100. The hub 100 may receive the responses and route them to the consumer 140
that initiated the query. In one embodiment, a hub 100 may include a router 104 that handles the routing of queries
to providers 120 and the routing of responses to consumers 140. Thus, the distributed information discovery
platform allows information providers 120 to publish a description of queries that they are willing to answer.
Information consumers 140 can submit queries to the network, which routes each query to all interested providers
120.
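
The division of labour between the resolver 102 and the router 104 may be sketched in-process as follows. This is an illustrative assumption only: providers are modelled as plain Python callables rather than remote query servers, and "qualified" is taken to mean that at least one query term appears among a provider's registered keywords.

```python
class Resolver:
    """Picks providers whose registered keywords overlap the query (assumed rule)."""

    def __init__(self):
        self.registrations = {}  # provider name -> set of registered keywords

    def register(self, name, keywords):
        self.registrations[name] = set(keywords)

    def qualified(self, query):
        terms = set(query.lower().split())
        return [name for name, kws in self.registrations.items() if terms & kws]


class Router:
    """Sends a query to the selected providers and gathers their responses."""

    def __init__(self, providers):
        self.providers = providers  # provider name -> callable(query) -> list of results

    def route(self, query, names):
        responses = []
        for name in names:
            for result in self.providers[name](query):
                responses.append((name, result))
        return responses


class Hub:
    """Combines a resolver and a router, as the hub 100 does in the description."""

    def __init__(self, resolver, router):
        self.resolver = resolver
        self.router = router

    def handle(self, query):
        return self.router.route(query, self.resolver.qualified(query))


resolver = Resolver()
resolver.register("news", ["sun", "stock"])
resolver.register("calc", ["sum"])
router = Router({
    "news": lambda q: [f"headline about {q}"],
    "calc": lambda q: ["42"],
})
hub = Hub(resolver, router)
print(hub.handle("sun stock"))
```

Only the "news" provider receives the query in this example, since no query term matches the keywords registered by "calc".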

In many applications, a program or node may act as both provider 120 and consumer 140. Physically, a 
provider 120 may be, for example, either an individual computer or a load-balanced set of computers; a consumer 
140 may be, for example, an individual computer or a web service; and the network may encompass a cloud of 
machines. The term "computer" is not limited to any specific type of machine and may include mainframes, servers,
desktop computers, laptop computers, hand-held devices, PDAs, telephones or mobile phones, pagers, set-top boxes,
or any other type of processing or computing device. Providers 120 and consumers 140 may contact the network
through a specific hub 100 implemented on one or more machines. A hub 100 may provide virtual access to the
entire distributed information discovery network. In some embodiments, providers 120 and consumers 140 may
contact different hubs 100.

5 In some embodiments, the distributed information discovery platform may provide the following 

functionality: 

• The query routing protocol: a protocol for defining queries, responses and registrations. The query routing 
protocol may allow both structured, lightweight and efficient query message exchange. In one 
embodiment, the query routing protocol may be implemented in XML. 

10 • Queryspaces: Since providers may have widely differing kinds of content or resources in their datastores, 

the query routing protocol may be used to define "queryspaces" that may be used to define the structure of 
queries and the associated registration information for a provider 120. In one embodiment,* queryspaces 
may define the structure of a valid query that a provider 120 can process. In one embodiment, queryspaces 
may be implemented in XML. 

15 ♦ Registration^ providers 120 may register with a distributed information discovery network Registration 

2 information for a provider 120 may include queryspaces that define which queries the provider 120 may 

respond to. The registration, in one embodiment, may include an XML-based encoding of a logical 
statement characterized by a queryspace, optionally characterized by a schema. If no schema is specified a 
default schema for general keyword matching may be used, in one embodiment. 
20 • Query Formulation: users and end applications (consumers 140) may present queries to a distributed 

information discovery network as arbitrary XML, in one embodiment. Schema selection may be performed 
by HTTP header specification, in some embodiments. In one embodiment, queries presented by consumers 
140 may adhere to specific queryspaces. 

• Query Resolution: queries are resolved by a resolver 102 in the network .by matching query terms to 
25 registration terms. Providers 120 whose registration terms match the query terms are returned by the 

resolver 102. 

• Query Routing: queries are routed to the appropriate provider 120 by sending, for example, XML requests 
over HTTP. A router 104 sends the requests and awaits responses. In some embodiments, the router 104 
may continually monitor providers to determine availability and reliability. 

30 • Provider Responses: providers 120 respond to queries in, e.g., arbitrary XML that may include links to any ( 

results they have in their site. 

• Presentation: In one embodiment, the distributed information discovery network may not perform any
presentation of the responses from providers 120. In this embodiment, the consumer may perform such
presentation, e.g. either as a web page or as a client side user interface. In one embodiment, the distributed
information discovery network may collate results from providers 120, perform ranking on the results with
respect to the query and present them in HTML, for example. Thus, a general application or user
(consumer 140) may be able to query a distributed information discovery network and act on the responses
as it sees fit; for example, a music file sharing application may receive results and sort them according to
file size or connection rate.
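
By way of illustration only, consumer-side presentation of collated responses might be sketched as follows. The `Result` fields (`size_bytes`, `rate_kbps`) and the example links are hypothetical; the specification does not define a response schema.

```python
from dataclasses import dataclass


@dataclass
class Result:
    # Hypothetical fields; not defined by the query routing protocol.
    provider: str
    link: str
    size_bytes: int
    rate_kbps: int


def present(results):
    """Order results smallest file first, then fastest connection first."""
    return sorted(results, key=lambda r: (r.size_bytes, -r.rate_kbps))


results = [
    Result("p1", "http://p1.example/a.mp3", 4_000_000, 128),
    Result("p2", "http://p2.example/a.mp3", 3_500_000, 56),
    Result("p3", "http://p3.example/a.mp3", 3_500_000, 512),
]
ranked = present(results)
```

In this sketch the ranking policy lives entirely in the consumer, consistent with the embodiment in which the network performs no presentation.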






WO 02/091242 



PCT/US02/13577 



According to a distributed information discovery platform, providers may register descriptions of queries
which they desire to have routed to them. These registrations may be meta-data indexes. A user may send a search
query to a distributed information discovery routing system. The query may be compared to the registrations (e.g.
meta-data indexes). In one embodiment, the registrations may be stored in XML format describing a
conjunctive-normal logic. Queries are then routed to providers matching the query. In some embodiments, matches may be
determined according to a user or pre-defined relevance.
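
The matching of a query against conjunctive-normal registrations may be sketched as below. This is a minimal illustration that assumes each registration has already been parsed from XML into a list of clauses, where each clause is a set of (field, term) literals; the provider names and fields are hypothetical.

```python
def matches(registration, query_terms):
    """True if every clause (AND) has at least one literal (OR) in the query."""
    return all(
        any(literal in query_terms for literal in clause)
        for clause in registration
    )


def resolve(registrations, query_terms):
    """Return the names of providers whose registration matches the query."""
    return sorted(
        provider
        for provider, registration in registrations.items()
        if matches(registration, query_terms)
    )


# (type = music) AND (format = mp3 OR format = ogg)
registrations = {
    "music-provider": [
        {("type", "music")},
        {("format", "mp3"), ("format", "ogg")},
    ],
    "news-provider": [{("type", "news")}],
}
query = {("type", "music"), ("format", "ogg")}
matched = resolve(registrations, query)
```

A relevance-based embodiment could replace the boolean test with a score; the sketch shows exact matching only.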

An architecture for the distributed information discovery platform is shown in Figure 2, according to one
embodiment. In one embodiment, a consumer 140 may provide users an access point to a distributed information
discovery network. A consumer such as consumer 140A may include a consumer query response protocol (QRP)
interface 142. The consumer QRP interface 142 may send queries written in the query response protocol to the hub
100 for query resolution and routing. After sending a query, the consumer QRP interface 142 may await responses
from providers. In one embodiment, the queries may be received by a hub consumer QRP interface 108 of router
104. In one embodiment, the consumer QRP interface 142 may also perform formatting of the responses for
presentation to the end user or application. In one embodiment, consumers 140 may also include a user interface
(e.g. a web user interface) to the hub (e.g. the router and/or resolver). In one embodiment, a consumer 140 may
include a mechanism for ranking and presentation of query results. In one embodiment, this mechanism may be a
component of the consumer QRP interface 142. Ranking methodology may be implicit in each queryspace, and may
be returned as part of each response, in some embodiments. Some ranking schemes may require third-party
involvement.

In one embodiment, consumers such as consumer 140C may not include a consumer QRP interface 142.
These consumers may use a consumer proxy 110 to interface to the functionality of the hub 100. The consumer
proxy 110 may perform translation of queries formatted in one or more query protocols supported by the consumers
140 into queries in the query routing protocol. These queries may then be sent to the hub 100 for resolution and
routing. In one embodiment, the queries may be received by a hub consumer QRP interface 108 of router 104. The
consumer proxy 110 may also perform translation of query responses formatted in the query routing protocol into
one or more protocols supported by the consumers 140. As shown, one or more consumers 140 may interface with
the consumer proxy 110.
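
As a hedged sketch of the translation a consumer proxy might perform, the following converts a URL-style query string into an XML query. The `query` element, the `queryspace` attribute, and the URN are assumptions made for illustration; the actual query routing protocol message format is not reproduced here. The sketch also assumes each parameter name is a valid XML element name.

```python
from urllib.parse import parse_qsl
import xml.etree.ElementTree as ET


def to_qrp_query(query_string, queryspace="urn:example:keyword"):
    """Translate 'name=value&...' into a hypothetical XML query document."""
    root = ET.Element("query", {"queryspace": queryspace})
    for name, value in parse_qsl(query_string):
        # Each parameter becomes a child element holding its value.
        ET.SubElement(root, name).text = value
    return ET.tostring(root, encoding="unicode")


xml_query = to_qrp_query("text=jazz&format=mp3")
```

Translation of responses back into a consumer's protocol would follow the same pattern in reverse.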

In one embodiment, a provider such as provider 120A may include a provider query response protocol
(QRP) interface 122 that may accept queries from the hub 100 in the query routing protocol and respond to the
queries with query responses in the query routing protocol. The provider QRP interface 122 may perform
translation of queries into provider-specific requests. In one embodiment, the provider QRP interface 122 may not
perform any indexing or searching itself, but rather may call the appropriate indexing and/or searching interface of
the provider 120, for example, a database search engine. In this embodiment, the provider QRP interface 122 may,
if necessary, translate the queries from the query response protocol into a protocol that may be used by the
appropriate indexing and/or searching interface of the provider 120. The provider QRP interface 122 may also, if
necessary, translate the query responses from the protocol used by the appropriate indexing and/or searching
interface of the provider 120 into the query response protocol. A provider QRP interface 122 may be, for example,
a small modification of an existing search engine script (Java Server Page (JSP), Perl etc.) so that queries from a
distributed information discovery network can be applied to the provider's search engine.
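
A provider QRP interface of this kind may be pictured as a thin adapter around an existing search function. In the sketch below, `local_search`, the in-memory catalog, and the `query`, `text`, `response`, and `result` element names are all hypothetical stand-ins; the adapter itself performs no indexing or searching.

```python
import xml.etree.ElementTree as ET


def local_search(keywords):
    """Stand-in for the provider's existing search engine (hypothetical)."""
    catalog = {
        "jazz": ["http://provider.example/jazz/1"],
        "blues": ["http://provider.example/blues/7"],
    }
    links = []
    for word in keywords:
        links.extend(catalog.get(word, []))
    return links


def handle_query(xml_query):
    """Translate an XML query into a local search call and wrap the results."""
    keywords = ET.fromstring(xml_query).findtext("text", default="").split()
    response = ET.Element("response")
    for link in local_search(keywords):
        ET.SubElement(response, "result", {"href": link})
    return ET.tostring(response, encoding="unicode")


reply = handle_query("<query><text>jazz</text></query>")
```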

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information (e.g. XML-typed information) that may prevent collisions and improve search accuracy. Furthermore, in 
one embodiment, multiple metadata descriptors (called content advertisements) may be used to identify a single 
instance of shared content. Allowing multiple advertisements enables applications and services to describe content 
in a very personal, custom manner that may enable greater search accuracy in any language. 
The peer-to-peer platform's security model may be orthogonal to the concepts of peers, policies, peer
groups 304, and pipes in the peer-to-peer platform. In one embodiment, security in the peer-to-peer platform may
include, but is not limited to:

• credentials - a credential is an opaque token that may provide an identity and a set of associated 
capabilities; 

• authenticators - an authenticator is code that may receive messages that either request a new credential or
request that an existing credential be validated; and

• policies - security policies at both the network and content peer group level may provide a comprehensive 
security model that controls peer-to-peer communication as well as content sharing. 

In one embodiment, all messages may include a network peer group credential that identifies the sender of
the message as a full member in good standing. In addition to this low-level communication credential, content peer
groups may define membership credentials that define a member's rights, privileges, and role within the group and
content access and sharing credentials that define a member's rights to the content stored within the group.
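
The roles of credentials and authenticators may be illustrated with the following sketch. It assumes, purely for illustration, that a credential is a string carrying a peer identity, a capability list, and an HMAC tag computed with a key held by the authenticator; the platform itself does not prescribe this construction, and the class and method names are hypothetical.

```python
import hashlib
import hmac
import secrets


class Authenticator:
    """Issues new credentials and validates existing ones (illustrative)."""

    def __init__(self):
        # Secret key known only to this authenticator.
        self._key = secrets.token_bytes(32)

    def _tag(self, payload):
        return hmac.new(self._key, payload.encode(), hashlib.sha256).hexdigest()

    def issue(self, peer_id, capabilities):
        """Return a token binding an identity to a set of capabilities."""
        payload = f"{peer_id}|{','.join(sorted(capabilities))}"
        return f"{payload}|{self._tag(payload)}"

    def validate(self, credential):
        """Check that the token was issued by this authenticator, unmodified."""
        payload, sep, tag = credential.rpartition("|")
        return bool(sep) and hmac.compare_digest(tag, self._tag(payload))


auth = Authenticator()
cred = auth.issue("peer-42", {"read", "share"})
```

Recipients other than the authenticator treat the token as opaque and simply forward it for validation.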

One motivation for grouping peers together is to share content. Types of content items that may be shared
include, but are not limited to, text files, structured documents such as PDF and XML files, and active content like a
network service. In one embodiment, content may be shared among group members, but not groups, and thus no
single item of content may belong to more than one group. In one embodiment, each item of content may have a
unique identifier also known as its canonical name. This name may include a peer group universal unique identifier
(UUID) and another name that may be computed, parsed, and maintained by peer group members. In one
embodiment, the content's name implementation within the peer group is not mandated by the peer-to-peer platform.
The name may be a hash code, a URI, or a name generated by any suitable means of uniquely identifying content
within a peer group. The entire canonical content name may be referred to as a content identifier. Figure 11
illustrates an exemplary content identifier according to one embodiment. In one embodiment, a content item may be
advertised to make the item's existence known and available to group members through the use of content
advertisements.
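
One possible way to form a content identifier from a peer group UUID and a hash-code name is sketched below. The "/" separator and the choice of SHA-256 are assumptions for illustration only; the layout shown in Figure 11 is not reproduced here.

```python
import hashlib
import uuid


def content_identifier(group_uuid, content):
    """Join the peer group UUID with a hash-derived name for the content."""
    name = hashlib.sha256(content).hexdigest()
    return f"{group_uuid}/{name}"


group = uuid.UUID("12345678-1234-5678-1234-567812345678")
cid = content_identifier(group, b"hello")
```

Because the name is derived from the content bytes, two members computing the identifier for the same bytes in the same group arrive at the same value.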

Each peer group member may share content with other members using a sharing policy that may name or
rely on a sharing protocol. The default content sharing protocol may be a standard peer group sharing protocol of
the peer-to-peer platform. Higher-level content systems such as file systems and databases may be layered upon the
peer group sharing protocol. In one embodiment, the peer group sharing protocol is a standard policy embodied as a
core protocol. In one embodiment, higher-level content protocols are optional and may be mandated by a custom
policy and not the peer-to-peer platform.

Figure 12 is a block diagram illustrating two peers using a layered sharing policy and several protocols to 
share content according to one embodiment. Each peer 200 includes core services 210 and one or more high-level, 
optional services 220. Core services 210 may include peer group sharing software that may be used to access a local 
store 214 (e.g. sharable content). High-level services 220 may include such services as the content management
services 222 and the search and index system services 224 of this illustration. The core services 210 and high-level
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services 220 interface through a peer group sharing API 216 to the peer group sharing software 212. The peer 
group sharing software 212 on the two peers 200 may interface to each other using the low-level peer group sharing 
protocol 218. High-level services 220 may interface using higher-level protocols. For example, the content
management services 222 on the two peers may interface using peer group content management protocols 226, and
the search and index system services 224 may interface using content search and indexing protocols 228.

An instance of content may be defined as a copy of an item of content. Each content copy may reside on a 
different peer in the peer group. The copies may differ in their encoding type. HTML, XML and WML are 
examples of encoding types. These copies may have the same content identifier, and may even exist on the same 
peer. An encoding metadata element may be used to differentiate the two copies. Each copy may have the same
content identifier as well as a similar set of elements and attributes. Making copies of content on different peers may
help any single item of content be more available. For example, if an item has two instances residing on two 
different peers, only one of the peers needs to be alive and respond to the content request. In one embodiment, 
whether to copy an item of content may be a policy decision that may be encapsulated in higher-level applications 
and services. 
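
The availability benefit of multiple instances may be sketched as follows. The `Instance` record and the `fetch` helper are hypothetical; they assume that the set of live peers is known to the caller.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    content_id: str
    encoding: str  # encoding metadata, e.g. "HTML", "XML" or "WML"
    peer: str


def fetch(instances, content_id, alive_peers, encoding=None):
    """Return the first instance of the content held by a live peer, if any."""
    for inst in instances:
        if inst.content_id != content_id:
            continue
        if encoding is not None and inst.encoding != encoding:
            continue
        if inst.peer in alive_peers:
            return inst
    return None


instances = [
    Instance("doc-1", "HTML", "peerA"),
    Instance("doc-1", "XML", "peerB"),
]
found = fetch(instances, "doc-1", alive_peers={"peerB"})
```

Both instances share one content identifier; only the encoding element and the hosting peer differ.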

One embodiment of the peer-to-peer platform may provide a content management service. A content
management service is a non-core (high-level) service that uses the peer group sharing protocol to facilitate content
sharing. In one embodiment, the peer group sharing protocol does not mandate sharing policies regarding the 
replication of content, the tracking of content, metadata content (including indexes), and content relationship graphs 
(such as a hierarchy). In one embodiment, the content management service may provide these extra features. 

Items of content that represent a network service may be referred to as active content. These items may
have additional core elements above and beyond the basic elements used for identification and advertisement.
Active content items may be recognized by Multi-Purpose Internet Mail Extensions (MIME) content type and 
subtype. In one embodiment, all peer-to-peer platform active contents may have the same type. In one embodiment, 
the subtype of an active content may be defined by network service providers and may be used to imply the
additional core elements belonging to active content documents. In one embodiment, the peer-to-peer platform may
give latitude to service providers in this regard, yielding many service implementation possibilities. Some typical 
kinds of elements associated with a network service may include, but are not limited to: 

• lifecycle elements - an instance of active content may adhere to a lifecycle. A lifecycle element defines a
set of behavior states such as started and stopped. The lifecycle element may itemize the service's lifecycle
and a set of instructions used to manipulate the lifecycle;

• runtime elements - runtime elements define the set of local peer runtimes in which this active content can
execute (e.g. Java, Solaris, Win32, etc.);

• user interface elements - a user interface element defines the policy or policies by which a user interface is
displayed;

• configuration elements - a configuration element defines the policy or policies by which the service may be
configured; and

• storage elements - a storage element defines the policy or policies the service may use for persistent and/or
transient storage.
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As previously discussed, each peer may have a core protocol stack, a set of policies and one or more 
services. In one embodiment, the peer-to-peer platform may define a standard service advertisement. In one 
embodiment, the standard service advertisement may include lifecycle, runtime, and configuration elements. 

Some services may be applications. An application may have a user interface element and a storage
element in addition to the lifecycle, runtime, and configuration elements. In one embodiment, a service
advertisement may also include startup information. The startup information may direct the local core peer software
as to how and when to start the service. For example, some services may be marked (in the advertisement) to start at
boot, while others may be marked to start when a message arrives in a specific advertised pipe. In one embodiment,
services marked to start when a message arrives in a specific advertised pipe may be used to implement daemon
services that block in the background awaiting a message to arrive in an input pipe.
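
The effect of startup information on the local core peer software may be sketched as below. The `ServiceAdvertisement` fields and the event names "boot" and "message" are hypothetical and chosen only to mirror the two startup markings described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ServiceAdvertisement:
    name: str
    start_at_boot: bool = False
    start_on_pipe: Optional[str] = None  # advertised pipe that triggers startup


def services_to_start(ads, event, pipe=None):
    """Select the services whose startup marking matches the event."""
    if event == "boot":
        return [ad.name for ad in ads if ad.start_at_boot]
    if event == "message" and pipe is not None:
        return [ad.name for ad in ads if ad.start_on_pipe == pipe]
    return []


ads = [
    ServiceAdvertisement("indexer", start_at_boot=True),
    ServiceAdvertisement("print-daemon", start_on_pipe="print-requests"),
]
at_boot = services_to_start(ads, "boot")
on_message = services_to_start(ads, "message", pipe="print-requests")
```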

In one embodiment, the peer-to-peer platform recognizes two levels of network services: peer services and
peer group services. Each level of service may follow the active content typing and advertisement paradigm, but
each level may provide a different degree (level) of reliability. In one embodiment, a peer service may execute on a
single peer network node only. If that node happens to fail, the service fails too. This level of service reliability
may be acceptable for an embedded device, for example, providing a calendar and email client to a single user. A
peer group service, on the other hand, may include a collection of cooperating peer services. If one peer service
fails, the collective peer group service may not be affected, because chances are that one or more of the other peer
services are healthy. Thus, a peer group service may provide consumers (client peers) a highly reliable, fault-tolerant
cluster of identical service implementations, servicing multiple concurrent peer requests. Services of this kind may
be defined as content within the peer group. Specific service instances (as represented by service advertisements)
may be obtained using the peer information protocol. In one embodiment, peers have the option of contacting a
specific service instance using the peer information protocol, or by contacting a group of services through a special
active content policy.
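
The fault tolerance of a peer group service may be illustrated by a simple failover loop over cooperating peer services. The `PeerServiceDown` exception and the callable-per-service representation are assumptions made for this sketch.

```python
class PeerServiceDown(Exception):
    """Raised by a peer service whose node has failed (hypothetical)."""


def call_group_service(peer_services, request):
    """Try each cooperating peer service until one answers."""
    for service in peer_services:
        try:
            return service(request)
        except PeerServiceDown:
            continue  # this instance failed; try the next one
    raise PeerServiceDown("no healthy peer service in the group")


def failed_service(request):
    raise PeerServiceDown("node unreachable")


def healthy_service(request):
    return f"handled:{request}"


answer = call_group_service([failed_service, healthy_service], "lookup")
```

A single peer service corresponds to a list of length one, in which case a node failure surfaces directly to the caller.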

One embodiment of the peer-to-peer platform may use advertisements. Advertisements are
language-neutral abstract data structures. In one embodiment, advertisements may be defined in a markup language such as
XML. In one embodiment, in accordance with a software platform binding, advertisements may be converted to and
from native data structures such as Java objects or 'C' structs. In one embodiment, each protocol specification may
describe one or more request and response message pairs. Advertisements may be documents exchanged in
messages. The peer-to-peer platform may define standard advertisement types including, but not limited to, policy
advertisements, peer advertisements, peer group advertisements, pipe advertisements, service advertisements, and
content advertisements. In one embodiment, subtypes may be formed from these basic types using schemas (e.g.
XML schemas). Subtypes may add extra, richer metadata such as icons. In one embodiment, the peer-to-peer
platform protocols, policies, and core software services may operate only on the basic abstract types.
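
Conversion between an XML advertisement and a native data structure may be sketched as follows for a Python binding. The `PipeAdvertisement` class and its `Name` and `Type` elements are hypothetical and do not reflect a schema defined by the platform.

```python
from dataclasses import dataclass
import xml.etree.ElementTree as ET


@dataclass
class PipeAdvertisement:
    name: str
    pipe_type: str


def to_xml(ad):
    """Serialize the native object into an XML advertisement document."""
    root = ET.Element("PipeAdvertisement")
    ET.SubElement(root, "Name").text = ad.name
    ET.SubElement(root, "Type").text = ad.pipe_type
    return ET.tostring(root, encoding="unicode")


def from_xml(text):
    """Rebuild the native object from an XML advertisement document."""
    root = ET.fromstring(text)
    return PipeAdvertisement(root.findtext("Name"), root.findtext("Type"))


original = PipeAdvertisement("print-requests", "unicast")
round_tripped = from_xml(to_xml(original))
```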

In one embodiment, all peer-to-peer platform advertisements are represented in XML. XML may provide a
means of representing data and metadata throughout a distributed system. XML may provide universal
(software-platform neutral) data because it may be language agnostic, self-describing, strongly-typed and may ensure correct
syntax. In one embodiment, the peer-to-peer platform may use XML for platform resource advertisements and for
defining the messages exchanged in the protocol set. Existing content types (MIME) may be described using a level
of indirection called metadata. All XML Advertisements may be strongly typed and validated using XML schemas.
In one embodiment, only valid XML documents that descend from the base XML advertisement types may be
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accepted by peers supporting the various protocols requiring that advertisements be exchanged in messages. Another 
feature of XML is its ability to be translated into other encodings such as HTML and WML. In one embodiment,
this feature of XML may be used to provide support for peers that do not support XML to access advertised 
resources. 

In one embodiment, advertisements may be composed of a series of hierarchically arranged elements. Each 
element may contain its data and/or additional elements. An element may also have attributes. Attributes may be 
name-value string pairs. An attribute may be used to store metadata, which may be used to describe the data within 
the element. 

In one embodiment, peer-to-peer platform advertisements may contain elements including, but not limited
to:

• default language encoding element - in one embodiment, all human readable text strings are assumed to be 
of this encoding, unless otherwise denoted. As an example: 

<default Language>en-CA</default Language>

• resource name (canonical name string containing a UUID) - in one embodiment, a unique 128-bit number 
naming the resource within the platform; and 

• one or more <Peer Endpoint> elements used to access the resource. Peer endpoint elements may contain a 
network transport name (for example, a string followed by a '://') and a Peer address on transport (for 
example, a string). 
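
Splitting a peer endpoint into its network transport name and its peer address on that transport may be sketched as below. The helper name and the example endpoint value are hypothetical.

```python
def parse_endpoint(endpoint):
    """Split 'transport://address' into (transport, address)."""
    transport, sep, address = endpoint.partition("://")
    if not sep or not transport or not address:
        raise ValueError(f"not a peer endpoint: {endpoint!r}")
    return transport, address


transport, address = parse_endpoint("tcp://192.0.2.10:9701")
```

The address is kept as an uninterpreted string, since its form depends on the transport.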

Peer-to-peer platform advertisements may also contain one or more optional elements including, but not
limited to, a resource provider description element and a resource provider security policy element. A resource
provider description element may be a standard element that describes the provider of the resource. A resource 
provider security policy element may be a standard element that describes the provider's security. 

A resource provider description element may include, but is not limited to: 

• a title (non-canonical string suitable for UI display) 

• a provider name (canonical name string containing a UUID) 

• a version (a string) 

• a URI to obtain additional Info (a string) 

For example, a light switch service provider's description element might be: 

<title>ABC Programmable Lighting Switch</title> 

<provider>ABC, an XYZ Company</provider>

<version>1.0</version>

<additionalInfo>http://…</additionalInfo>

In one embodiment, the same set of descriptive information (title, provider name, version, and additional 
info URI) may be used throughout all advertisement types to describe the particular provider. 

A resource provider security policy element may include, but is not limited to: 

• an authentication policy - an embedded policy advertisement that describes the manner in which this 
provider authenticates others; and 
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